SUMMARY


DNA sequencing of genomic cDNA clones of avian infectious bronchitis virus (IBV) has been carried out. 
770 bases have been determined which include genomic sequences spanning the 5’ termini of the two smallest 
mRNAs of the 3’-coterminal “nested” set: mRNA A and mRNA B. This region contains the complete coding 
sequences for mRNA B which are additional to those present in mRNA A. Two open reading frames are
present, predicting proteins of Mr 7500 and 9500.


INTRODUCTION 


Avian IBV, in common with other coronaviruses, 
has a single-stranded, polyadenylated, infectious 
RNA genome approx. 20 kb in length (Stern and 
Kennedy, 1980a). In infected cells multiple sub- 
genomic positive-stranded RNAs are produced 
(Siddell et al., 1983). For IBV and MHV these have 
been shown to consist of a 3’-coterminal “nested” 
set (Stern and Kennedy, 1980a; Lai et al., 1981; 
Leibowitz et al., 1981). For both IBV and MHV the 
messenger function of these subgenomic RNAs has 
been demonstrated (Rottier et al., 1981; Stern et al., 
1982; Siddell, 1983). However, certain differences of 
genome organisation have become apparent between 


Abbreviations: bp, base pairs; IBV, infectious bronchitis virus; 
kb, kilobases or kilobase pairs; MHV, murine hepatitis virus; 
mRNA, messenger RNA; ORF, open reading frame. 
these two coronaviruses. First, MHV has six major
subgenomic RNAs whereas IBV has only five (Stern 
and Kennedy, 1980b; Lai et al., 1981). Second, the 
coding function of the various messenger RNAs is 
different. Both IBV and MHV contain three main 
structural polypeptides: the nucleocapsid, the 
membrane (E1), and the spike or peplomer (E2)
polypeptides (Cavanagh, 1981; Siddell et al., 1983). 
In both systems the smallest RNA codes for the 
nucleocapsid protein but in MHV the next smallest 
RNA codes for the membrane polypeptide, whereas 
in IBV the membrane polypeptide is coded for by the 
third smallest RNA (Siddell et al., 1980; Siddell, 
1983; Rottier et al., 1981; Stern et al., 1982; Stern 
and Sefton, 1984). These differences are summarised 
in Fig. 1. 

The organisation of the messenger RNAs and in 
vitro translation studies have led to the hypothesis 
that the 5’-most sequences of each mRNA, which 
are not present in the next smallest mRNA, contain 
Fig. 1. Sequence organisation of MHV and IBV mRNAs. For
MHV, mRNA 1 is the same length as the genomic RNA. For
IBV, mRNA F is the same length as the genomic RNA. For IBV
the approximate sizes of RNAs A, B, C, D and E are 2, 2.4, 3.4,
4.1 and 7.8 kb (see MATERIALS AND METHODS, section c).
The coding assignments are shown at the left-hand end of each 
mRNA. N, nucleocapsid; M, membrane or E1 polypeptide; S, 
spike or peplomer polypeptide. The heavy bar shows the region 
of sequence presented in this paper. Also shown is the position 
of clone C5.136. 


the complete coding sequences for the major protein 
product produced by that messenger species (Stern 
and Kennedy, 1980b; Lai et al., 1981). However, 
since the only coronavirus sequences published are 
those of the smallest RNA species (Armstrong et al., 
1983; Skinner and Siddell, 1983) it has not been 
possible to examine this hypothesis at the RNA 
sequence level. RNA sequence data from this region 
might also enable us to predict the properties and 
thus aid the identification of possible polypeptides 
coded for by mRNA B. 

In this paper we report the nucleotide sequence of 
a cloned cDNA copy of IBV genomic RNA in the 
region corresponding to the 5’ end of mRNA B. The 
sequence shows that the 5’-most sequences of 
mRNA B could code for a hydrophobic 7.5-kDal 
protein. 


MATERIALS AND METHODS 
(a) Cloning of IBV genomic RNA 


The preparation of cDNA clones has been pre- 
viously described (Brown and Boursnell, 1984). 
Briefly, virion RNA was isolated from IBV strain 
Beaudette grown in embryonated eggs. cDNA was 
produced by oligo(dT)-primed reverse transcription 


of the RNA, followed by self-primed reverse transcription
to generate the second strand. S1 nuclease-treated
cDNA was dC-tailed using terminal transferase,
annealed to dG-tailed PstI-cleaved pAT153
(Twigg and Sherratt, 1980) and transformed into 
Escherichia coli HB101. Ampicillin-sensitive colonies 
were selected for further characterisation. 


(b) Characterisation of cDNA clones 


Viral clones were identified by hybridisation with 
a probe prepared by polynucleotide kinase labelling 
of alkali-treated, full-length IBV genomic RNA. 
Restriction sites were mapped on a series of clones 
and this enabled construction of a continuous map, 
3.3 kb in length. That these included the poly(A) 
sequences at the 3’ terminus of the viral genome was
confirmed by hybridisation with a kinase-labelled 
poly(U) probe. 


(c) Formaldehyde-agarose gel analysis of IBV 
mRNAs 


1.5% formaldehyde-agarose gels were run essen- 
tially as described by Maniatis et al. (1982). Total 
RNA samples from IBV-infected chick kidney cell 
cultures were run overnight at 60 V on 16 cm vertical 
gels. IBV mRNAs were detected by blotting onto 
nitrocellulose and probing with nick-translated cloned
IBV sequences (Maniatis et al., 1982). Mr values were
calculated by comparison with the mobilities of
DNA restriction fragments and E. coli and chicken
ribosomal RNAs.


(d) DNA sequence determination 


Plasmid DNA was prepared by a modification of 
the method of Holmes and Quigley (1981). DNA
restriction fragments, 3’ end-labelled with [α-32P]dNTPs
using Klenow polymerase or 5’ end-labelled
with [γ-32P]ATP using T4 polynucleotide
kinase, were sequenced essentially as described by
Maxam and Gilbert (1980). The depurination
reaction was carried out in 66% formic acid for
10 min at 20°C, after which the samples were treated
in the same way as the pyrimidine reaction. For 
sequencing some regions of the DNA, restriction 
digests of the viral insert were recloned into the 
plasmid pUC9 allowing sequencing from adjacent 


vector restriction sites (Messing and Vieira, 1982). 
Sequence data were stored and analysed on an 
Apple IIe microcomputer using the programs of Larson
and Messing (1983) and on a VAX 11/780 minicomputer
using the programs of Staden (1984).


RESULTS 


770 bp of DNA sequence from one IBV genomic 
clone, C5.136 (Brown and Boursnell, 1984) have 
been determined. This sequence corresponds to the 
genomic RNA sequence stretching from 2.40 kb to 
1.63 kb from the 3’ end of the viral genome. In 
Fig. 2a the arrows show the direction and extent of 
DNA sequence information obtained from individual 
restriction enzyme cleavage sites. Fig. 2b shows 
positions of restriction sites used in the sequencing. 
Fig. 2. Sequencing strategy for cloned IBV cDNA (clone
C5.136). (a) Arrows show the direction and extent of sequence 
information obtained from individual restriction sites. Arrows 
starting with solid circles indicate sequencing of DNA 3’ end- 
labelled with Klenow polymerase. Arrows starting with open 
circles indicate sequencing of DNA 5’ end-labelled with polynucleotide
kinase. (b) A map of restriction sites used in the sequencing.
(c) The locations of termination codons (vertical bars) and
potential initiation codons (bars with open circles on top) in the
three possible translational reading frames. The heavy black
lines show the main ORFs: Mr 7500, Mr 9500, and the putative
nucleocapsid ORF. Also shown are the positions of the 5’ ends 
of mRNAs A and B as determined by S1 nuclease mapping (see 
Fig. 3). 
95% of the sequence has been determined on both
strands, and on each strand most regions have been 
sequenced more than once from different restriction 
sites. 

Fig. 2c shows the positions of the initiation and 
termination codons in the three reading frames and 
the positions of the 5’ ends of mRNAs A and B as 
determined by S1 nuclease mapping (Brown and
Boursnell, 1984). It should be noted that S1 nuclease
mapping will determine the 5’ end of the “body” of 
the mRNA (see DISCUSSION). The DNA sequence 
of 770 nucleotides, with a translation of the three 
main ORFs, is shown in Fig. 3. The lines below the 
sequence at positions 131-165 and 440-466 indicate 


[Fig. 3: nucleotide sequence of the 770-bp region, numbered every 15 bases, with single-letter amino acid translations of the three main ORFs]

Fig. 3. 770-bp sequence of part of IBV cDNA clone C5.136. A 
translation of the three main ORFs is shown in single-letter 
amino acid code. Termination codons for these ORFs are shown 
as asterisks. Arrows above the sequence show the 5’ ends of 
mRNAs A and B as determined by S1 nuclease mapping. Lines 
below the sequence at these points indicate the regions of 
homology which occur at the 5’ ends of mRNAs A and B. 
the regions of homology which occur at the 5’ end
of mRNAs A and B and the arrows above the 
sequence at these points show the 5’ ends of the 
bodies of the mRNAs as determined by S1 nuclease 
mapping (Brown and Boursnell, 1984). 

In the genomic sequence immediately 5'-wards of 
the end of mRNA B, there are no ORFs longer than 
81 bases. About 30 bases into mRNA B, at position
167, there is an AUG codon, followed by an ORF
of 195 bases. This predicts a protein of Mr 7500. The
coding sequences for this putative protein are entirely
contained in that part of mRNA B which is not
present in mRNA A. One base before the termination
codon for this Mr 7500 ORF, at position 361,
there is an initiation codon followed by an open
reading frame which could code for a protein of Mr
9500. This ORF is not contained within the “unique”
sequences of mRNA B and overlaps considerably 
with mRNA A. In mRNA A, 100 bases from the 5’ 
terminus, at position 552, is the beginning of an ORF 
which extends beyond the limits of the sequenced 
region. This is likely to be the start of the gene coding 
for the nucleocapsid protein. A comparison of this 
partial sequence to the sequences published for the 
nucleocapsid proteins of MHV-A59 and MHV- 
JHM7 (Armstrong et al., 1983; Skinner and Siddell, 
1983) reveals no significant homology at the RNA or 
protein level. Although some homology might be 
expected it should be noted that no serological cross- 
reaction between IBV and any of the mammalian 
coronaviruses has been reported (Siddell et al., 
1983). 
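
The reading-frame analysis described above can be illustrated with a minimal sketch. It scans the three forward frames of a DNA sequence for ATG-initiated ORFs and converts an ORF length into a rough Mr. The function names, the toy sequence and the mean residue mass of 110 are assumptions made for illustration; they are not taken from the programs cited in MATERIALS AND METHODS, and the toy sequence is not the IBV sequence.

```python
# Minimal ORF scan over the three forward reading frames (illustrative only).
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=1):
    """Return (start, end, frame) tuples for ATG-initiated ORFs.

    start and end are 1-based and inclusive; end is the last base before
    the stop codon. An ORF that runs off the end of the sequence is
    reported with end = None.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start + 1, i, frame + 1))
                start = None
        if start is not None:
            orfs.append((start + 1, None, frame + 1))
    return orfs

def approx_mr(n_residues, mean_residue_mass=110.0):
    """Rough Mr: residues x assumed mean residue mass, plus one water."""
    return n_residues * mean_residue_mass + 18.0

# Hypothetical toy sequence: one closed ORF and one that runs off the end.
toy = "CCATGAAATTTTAGGATGCCC"
orfs = find_orfs(toy)

# Arithmetic from the text: an ORF of 195 bases starting at position 167
# ends at position 361 and encodes 65 residues.
n_residues = (361 - 167 + 1) // 3
rough_mr = approx_mr(n_residues)
```

With the assumed mean residue mass, 65 residues give a value of about 7200, of the same order as the Mr 7500 predicted from the actual sequence; an exact figure requires the amino acid composition.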


DISCUSSION 


The region of the IBV sequence presented in this 
paper contains the 5’ ends, on the viral genome, of 
mRNAs A and B. The messenger RNAs of corona- 
virus IBV are probably transcribed from non-con- 
tiguous regions of the viral genome. Work on the 
murine coronavirus MHV has shown that a common 
5’ leader sequence, originating from the extreme 5’ 
end of the viral genome, is fused to the body of each 
messenger RNA (Lai et al., 1982; 1983; Spaan et al.,
1983; Baric et al., 1983). It is likely that the IBV 
messenger RNAs have the same structure. However, 
whether or not leader sequences are present, S1 
nuclease mapping experiments have shown that the 


5’ ends of the bodies of mRNAs B and A lie at positions
139 and 443, respectively (Brown and Boursnell,
1984) as shown in Fig. 3. 

Of the two ORFs which are present at the 5’ end 
of mRNA B, the Mr 7500 ORF seems to be the more
likely candidate for translation in vivo. The location
of the coding sequence for the putative Mr 7500
polypeptide fits in well with the hypothesis that the
major polypeptide product of each mRNA is translated
from those 5’ sequences not present in the next
smallest RNA. If the Mr 9500 polypeptide were
translated then this would no longer be true, since its
coding region stretches well into mRNA A. Furthermore,
the RNA sequences flanking the initiation
codons for the putative Mr 7500 and nucleocapsid
genes correspond well to those preferred for
functional eukaryotic initiation codons (Kozak,
1983). However, the sequence GNNAUGA around
the AUG codon at the start of the Mr 9500 ORF is
rare in this context. In addition the AUG codon at
the start of the Mr 7500 ORF is the first initiation
codon to occur in the body of mRNA B; since initiation
of translation at anything other than the first
AUG codon is known to be rare (Kozak, 1983) this 
suggests that this ORF codes for the major product 
of mRNA B. 
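
The two checks used in this argument, namely which AUG comes first in the mRNA body and which bases flank it, can be sketched as follows. The pattern notation follows the GNNAUGA form used above (the base three positions before the A of the AUG, and the base immediately after the G). The function names and the toy sequence are hypothetical, and the sketch only reports the context; it makes no judgement about how favourable a given context is.

```python
# Sketch: locate the first AUG in an mRNA and report its flanking context.
def first_aug(mrna):
    """Return the 0-based index of the first AUG, or -1 if there is none."""
    return mrna.upper().replace("T", "U").find("AUG")

def aug_context(mrna, pos):
    """Return the context of the AUG at 0-based index pos, e.g. 'GNNAUGA'.

    Numbering the A of the AUG as +1, the reported bases are those at
    positions -3 and +4. Returns None if either lies outside the sequence.
    """
    rna = mrna.upper().replace("T", "U")
    if pos < 3 or pos + 3 >= len(rna):
        return None
    return f"{rna[pos - 3]}NNAUG{rna[pos + 3]}"

# Hypothetical toy sequence containing two AUG codons.
toy = "CCGACCAUGGCUAUGA"
first = first_aug(toy)
first_context = aug_context(toy, first)
second_context = aug_context(toy, 12)
```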

The amino acid sequence of the putative Mr 7500
polypeptide shows it to be hydrophobic in nature and
to have an unusual composition in that 26% (17 out
of 65) of its residues are leucine. Of the six possible
triplets coding for leucine, one (UUA) is used 8 times 
out of 17. This unusual composition and codon bias, 
which would not be expected from a chance ORF, 
suggests that this polypeptide is translated in vivo. A 
computer analysis of the sequences presented here 
has been carried out using the program ANALYSEQ 
(Staden, 1984). This program uses certain criteria to 
select one of the three reading frames as being the 
most likely protein coding frame. Although better 
suited to analysing large ORFs, it is interesting to 
note that a search based on looking for codon biases 
above those expected from the base composition 
selects 85% of the Mr 7500 ORF as the most likely
coding frame. The 15% of codons not selected are
not in a single block, which might have suggested a
sequencing error leading to an artificial frameshift.
All of the nucleocapsid ORF which has so far been sequenced
was selected as the most likely coding
frame, but none of the Mr 9500 ORF.
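
The leucine content and leucine codon usage quoted above can be tallied with a short sketch such as the following. It is a plain count, not the codon-bias method of ANALYSEQ, and the function name and toy ORF are assumptions made for illustration.

```python
from collections import Counter

# The six triplets coding for leucine.
LEUCINE_CODONS = {"UUA", "UUG", "CUU", "CUC", "CUA", "CUG"}

def leucine_usage(orf):
    """Tally leucine codons in an in-frame ORF (stop codon excluded).

    Returns (number of codons, number of leucine codons, Counter of the
    individual leucine codons).
    """
    rna = orf.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3  # ignore any trailing partial codon
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    leu = Counter(c for c in codons if c in LEUCINE_CODONS)
    return len(codons), sum(leu.values()), leu

# Hypothetical toy ORF, not the IBV sequence.
n_codons, n_leu, usage = leucine_usage("AUGUUAUUACUGGCUUUA")

# Figures from the text: 17 leucines in 65 residues, 8 of them UUA.
percent_leu = round(100 * 17 / 65)    # 26
percent_uua = round(100 * 8 / 17)     # 47
expected_if_uniform = 17 / 6          # about 2.8 uses per leucine codon
```

If the six leucine codons were used equally, each would be expected about three times among 17 leucines, against the eight uses of UUA reported.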


We have carried out in vitro translation of total 
and poly(A)+ RNA populations from IBV-infected 
chick kidney cell cultures using a rabbit reticulocyte 
lysate system. However, analysis of the products on 
10-18% polyacrylamide gradient gels containing 
urea could not resolve any small polypeptides due to 
high background in the relevant low-Mr range. A
similar problem has been found by Stern and Sefton 
(1984; Stern, D.F., personal communication) who 
have carried out in vitro translation studies of gel- 
purified and gradient-fractionated mRNAs and have 
identified no major specific product from mRNAs B 
or D. 

At present, therefore, it is not possible to say 
definitively whether either of these polypeptides is 
produced. In a search for small structural polypep- 
tides, a [3H]leucine-labelled preparation of virus has
been analysed on a 12.5% polyacrylamide tube gel 
(Cavanagh, D., personal communication), using the 
phosphate buffer system of Swank and Munkres 
(1971). This showed that there were three detectable 
polypeptides of apparent Mr 16000, 12000, and
10000. The percentage of the total counts in the gel 
accounted for by these polypeptides was <1%. 
Thus, if either of these ORFs codes for a structural 
polypeptide, it must only be present in very small 
quantities. It is also possible that they could code for 
non-structural polypeptides. These would be difficult 
to identify in IBV-infected cells by pulse-labelling 
techniques without using immunoprecipitation to 
lower the background of host-cell incorporation 
which is poorly shut off by IBV infection. The availa- 
bility of sequence data for these putative polypep- 
tides, however, opens up the possibility of using 
immunoprecipitation with antisera prepared against 
synthetic oligopeptides to search for the presence of 
these polypeptides in IBV-infected cells. 